The methylation of DNA regulates gene expression. On cell division the methylation state of the DNA is typically inherited 
from parent to daughter cells. While the chemical bond between the methyl group and the DNA is very strong, changes to the 
methylation state do occur and are observed to occur rapidly in response to external stimuli. The loss of methylation can be active, 
where an enzyme physically breaks the bond, or passive, where on cell division the methylation pattern is not properly copied to 
the newly constructed strand of DNA. 

Here we present a mathematical model of single locus passive demethylation for a dividing population of cells. The model 
describes the heterogeneity in the population expected from passive mechanisms. We see that even when the site-specific probabilities 
of passive demethylation are independent, conservation of methylation on the inherited strand gives rise to site-site correlations of 
the methylation state. We then extend the model to incorporate correlations between sites in the locus for demethylation rates. 
Biologically, correlations in demethylation rates might correspond to locus wide changes such as the inability of methyltransferase 
to access the locus. We also look at the effects of selection on the multicellular population. 

The model of passive demethylation not only provides a tool for measurement of parameters in loci-specific cases where passive 
demethylation is the dominant mechanism, but also provides a baseline in the search for active mechanisms. The model tells 
us that there are states of methylation inaccessible by passive mechanisms. Observation of these states constitutes evidence of 
active mechanisms, either de novo methylation or enzymatic demethylation. We also see that selection and passive demethylation 
combined can give rise to a stable heterogeneous distribution of gene methylation states in a population. 
1. Introduction 

The methylation of DNA is an epigenetic mechanism of gene 
regulation [1, 2, 3]. It is typically inherited from parent cell 
to daughter upon cell division, but there are notable epimutational 
divergences. These divergences give rise to tissue-specific 
methylation patterns [3, 4, 5, 6], changes in gene regulation 
in response to stimulus [7, 8, 9, 10, 11], and, in aberrant cases, 
cancer and developmental diseases [3, 12, 13]. Understanding 
the population dynamics of these epimutations is thus critical 
to understanding a number of important biological processes. 

In vertebrates, methylation occurs primarily on the cytosines 
of CpG dyads, where a carbon-carbon bond connects the methyl 
group to the cytosine. These sites are symmetric such that a 
methylation site on the Watson strand will have a counterpart 
on the reverse-complement Crick strand. 

On cell division, daughter cells will have a methylated par- 
ent strand and a newly constructed daughter strand which is 
initially unmethylated. Inheritance of the methylation state of 
the parent is performed by maintenance methylation where the 
enzyme DNMT1, finding a site on the parent strand that is 
hemimethylated (one cytosine in the dyad methylated, the other 
unmethylated), adds a methyl group to the corresponding site 
on the daughter strand. 

Epimutations are classified as either passive or active. In passive epimutations maintenance methylation does not occur, leaving the dyad hemimethylated. In active epimutations either a methyltransferase adds a methyl group (de novo methylation) or enzymatic activity leaves an unmethylated cytosine where a methylated one had been (active demethylation) [14, 15]. The loss of a methyl group without enzymatic action is thermodynamically unlikely due to the strength of the carbon-carbon bond, and the ability of enzymes to break this bond is in dispute [14, 15]. Active demethylation is typically assumed to take place via base excision and repair. 

Previous population epigenetic models of methylation dy- 
namics have looked at maintaining tissue specific methylation 
patterns [16, 17, 18, 19]. These models have made the approximation 
that the methylation status at each site is independent 
of the other sites. In maintenance methylation this approxima- 
tion is useful. In this manuscript we show that for loci-specific 
demethylation occurring in response to external stimuli, the site 
independent approximation omits important features. 

In this manuscript we model the population epigenetics of 
passive demethylation. We study a dividing population of cells 
with a multisite locus where passive demethylation can oc- 
cur. The model keeps track of site-site correlations which re- 
sult from the initially methylated state and the strength of the 
carbon-carbon bond. The result of this is a heterogeneous pop- 
ulation of cells with a distribution of methylation patterns. The 
model is extended to incorporate other sources of site-site cor- 
relations and to include selection. 
... active demethylation. De novo methylation is estimated to occur 
10 to 50 times less frequently than maintenance methylation 
[18], and its omission is a simplifying approximation. While 
there is evidence of active demethylation Q, the mechanisms 
are not yet well understood. The model in this paper thus also 
serves as a baseline such that deviations from it suggest signatures 
of active mechanisms. 



2. Model 

Here we construct a multicellular model of a single locus 
with multiple methylation sites. We neglect recombination as 
all sites are assumed to be close enough that it is unlikely to 
occur. The model keeps track of the fractions of cells in all 
possible methylation states and examines how these fractions 
change with the number of cell divisions. We first construct a 
single site model. The single site model is equivalent to those 
previously studied [16, 17, 18, 19]. We then construct multisite 
models illustrating the inaccuracy of site-site independence in 
the dynamic case. We then extend the model to include not 
only the site-site correlations in state, but also correlations 
between the error rates at different sites. We also extend the model to 
examine the effects of selection on the population. 

For a single site (CpG dyad) there are four possible states: 
methylated, hemimethylated Watson polarity, hemimethylated 
Crick polarity, and demethylated. We use the following sym- 
bols to illustrate these states: 

m   Methylated 
w   Hemimethylated, Watson polarity 
c   Hemimethylated, Crick polarity 
d   Demethylated 

Our model considers the population after maintenance 
methylation has occurred. It is typically assumed that maintenance 
methylation occurs immediately following DNA replication 
[20]. This model therefore describes cells in the G0 or 
G1 phases. 

Without demethylase, after division the methyl groups on 
the inherited parent strand remain. Maintenance methylation 
occurs at sites on the newly constructed strand if the site on 
the parent strand is methylated. Occasionally maintenance 
methylation fails to methylate the site on the newly constructed 
daughter strand, leaving the site in a hemimethylated state (Watson 
polarity if the parent strand was the Watson strand, and 
Crick polarity if the parent strand was the Crick strand). We 
refer to this omission as an error, though it may be biologically 
advantageous for this to occur, and label the probability of an 
error occurring at a single site $\mu$. Maintenance methylation 
therefore occurs with probability $1 - \mu$. 

The possible transitions for a single site from a parent to the pair of daughters are shown in Fig. 1 along with the probabilities of each outcome. We see that if the parent is in the methylated state there can be no daughter in the demethylated state, that a parent in either polarity hemimethylated state will give a single demethylated daughter cell, and that demethylated sites are trapping, having only daughters in the demethylated state. 
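
The single-site transitions described above can be written down as two small matrices, one for the daughter that inherits the Watson strand and one for the daughter that inherits the Crick strand. The following is a minimal sketch, not the paper's own code: the basis ordering (m, w, c, d), the function names and the value of `mu` are assumptions made here for illustration, and the paper's ordering may differ.

```python
from fractions import Fraction

# Assumed basis ordering; the ordering used in the paper may differ.
STATES = ("m", "w", "c", "d")


def watson_matrix(mu):
    """Transitions for the daughter that inherits the parent's Watson strand.

    Entry [row][col] is the probability of daughter state STATES[row]
    given parent state STATES[col].
    """
    return [
        [1 - mu, 1 - mu, 0, 0],  # m: Watson site methylated, maintenance succeeds
        [mu, mu, 0, 0],          # w: Watson site methylated, maintenance fails
        [0, 0, 0, 0],            # c: never produced for this daughter
        [0, 0, 1, 1],            # d: Watson site unmethylated in the parent
    ]


def crick_matrix(mu):
    """Transitions for the daughter that inherits the parent's Crick strand."""
    return [
        [1 - mu, 0, 1 - mu, 0],  # m: Crick site methylated, maintenance succeeds
        [0, 0, 0, 0],            # w: never produced for this daughter
        [mu, 0, mu, 0],          # c: Crick site methylated, maintenance fails
        [0, 1, 0, 1],            # d: Crick site unmethylated in the parent
    ]


def column_sums(matrix):
    """Each column should sum to one (every parent yields some daughter state)."""
    return [sum(row[col] for row in matrix) for col in range(len(matrix))]


mu = Fraction(3, 20)  # illustrative error rate
W = watson_matrix(mu)
C = crick_matrix(mu)
print(column_sums(W), column_sums(C))
```

The zero in the last row of the first column of both matrices encodes the statement that a methylated parent cannot have a demethylated daughter, and the ones in the last column encode the trapping demethylated state.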
Figure 1: Possible transitions on cell division with the probability of each outcome, 
where the probability of a daughter having a hemimethylated state is given by 
$\mu$ and having a methylated state is $1 - \mu$. We also see that the totally demethylated 
state (bottom) is trapping, having no transitions to other states. 



Multisite dynamics will require us to analyze the inheritance 
of the Watson and Crick strands from the parent separately, 
though the dynamics are symmetric. For the single site, taking 
the daughters on the top of each division in Fig. 1 for the 
Watson strand and those daughters on the bottom for the Crick 
strand, we segregate the transitions as in Fig. 2. 
Figure 2: The transitions with fractional rates for the daughters inheriting the 
Watson strand and those inheriting the Crick strand. 



From these transitions we can construct transition matrices. With the states labeled as basis vectors, the above transitions give us the Watson strand transition matrix $W$ and the Crick strand transition matrix $C$. 

2.1. One Site Model 

We now use the transition matrices to construct a probabilistic model for a single site. We describe the fractions of cells with each methylation state at generation $g$ as a vector $P^{(1)}(g)$, such that, e.g., 

$$\Pr(m \mid g) = m \cdot P^{(1)}(g) \qquad (4)$$

gives the fraction of cells that are methylated. The vector at $g + 1$ is given by: 

$$P^{(1)}(g+1) = \frac{1}{2}\left[W + C\right] P^{(1)}(g) \qquad (5)$$
$$= T^{(1)} P^{(1)}(g), \qquad (6)$$
$$T^{(1)} = \frac{1}{2}\left[W + C\right], \qquad (7)$$

where we have added superscripts to the transition matrix and state vector to indicate that it is for a single site model. The model assumes that at each generation every cell divides and that there is zero cell death. 

Starting with all cells methylated ($P^{(1)}(0) = m$) and taking $\mu = 0.15$ the model generates the dynamics shown in Fig. 3. We see from this figure that the fraction of methylated sites declines with generation number. This is due to the exponential growth in population size. We also see that the time-scale of the transition from methylated to demethylated is set by $1/\mu$. 

Figure 3: The fractions of sites with each methylation state as a function of generation number with $\mu = 0.15$ and initially all sites in the population methylated. Only one of the two hemimethylated states is shown as the probabilities of the two hemimethylated states are equivalent. 

2.2. Multisite Models 

Here we consider a single gene with multiple sites that can be methylated. We do not consider recombination as the rate of recombination on a single gene is too low. We describe an $n$ site sequence using the Kronecker product of the single site state vectors. Given a sequence $\{s_1, s_2, \ldots, s_n\}$ where each $s_i$ is one of the four states $s_i = m$, $w$, $c$ or $d$ corresponding to the states described above, the vector describing the sequence is given by: 

$$\{s_1, s_2, \ldots, s_n\} = s_1 \otimes s_2 \otimes \cdots \otimes s_n, \qquad (8)$$

and we use the curly braces above throughout the text as an abbreviation of the Kronecker products. 

The Kronecker product is defined for two vectors of length 4 according to: 

$$\{a, b\} = \begin{pmatrix} a_1 b_1 \\ a_1 b_2 \\ \vdots \\ a_2 b_1 \\ a_2 b_2 \\ \vdots \\ a_4 b_4 \end{pmatrix}, \qquad (9)$$

a vector of length 16. Similarly for two 4 x 4 matrices the Kronecker product generates a 16 x 16 matrix defined by: 

$$\{A, B\} = \begin{pmatrix} A_{11} B & \cdots & A_{14} B \\ \vdots & \ddots & \vdots \\ A_{41} B & \cdots & A_{44} B \end{pmatrix}. \qquad (10)$$

As an example, the two site sequence $\{m, m\}$ is given by $(1, 0, \ldots, 0)$, $\{m, c\}$ is given by $(0, 1, 0, \ldots, 0)$ and so on. 
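
The construction of a multisite state vector from single-site basis vectors can be sketched in a few lines. This is an illustrative sketch only: the helper names are hypothetical, and the basis ordering (m, w, c, d) is an assumption, so the position of a given sequence in the vector may differ from the paper's convention. Only the sequences {m, m} and {d, d}, which come first and last in any ordering that starts with m and ends with d, are checked here.

```python
STATES = ("m", "w", "c", "d")  # assumed ordering of the single-site basis


def basis_vector(state):
    """One-hot vector of length 4 for a single-site state."""
    return [1 if s == state else 0 for s in STATES]


def kron(a, b):
    """Kronecker product of two vectors: (a1*b1, a1*b2, ..., a2*b1, ...)."""
    return [x * y for x in a for y in b]


def sequence_vector(sequence):
    """Vector for a multisite sequence such as ("m", "w", "d")."""
    vec = [1]
    for state in sequence:
        vec = kron(vec, basis_vector(state))
    return vec


mm = sequence_vector(("m", "m"))
dd = sequence_vector(("d", "d"))
print(mm.index(1), dd.index(1), len(sequence_vector(("m", "w", "d"))))
```

Each additional site multiplies the length of the vector by four, so an n site sequence is described by a vector of length 4^n.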

Using this formalism the probability of a multisite sequence with $n$ sites at generation $g$ being in the state described by $\{s_1, s_2, \ldots, s_n\}$ is given by: 

$$\Pr(\{s_1, s_2, \ldots, s_n\} \mid g) = \{s_1, s_2, \ldots, s_n\} \cdot P^{(n)}(g). \qquad (11)$$

We can also see that the multisite probability vector is given by the Kronecker product of the individual site probability vectors: 

$$P^{(n)}(g) = \{P_1^{(1)}(g), P_2^{(1)}(g), \ldots, P_n^{(1)}(g)\}. \qquad (12)$$

The dynamics of the system are now obtained with a transition matrix $T^{(n)}$ that given $P^{(n)}(g)$ returns $P^{(n)}(g+1)$. 

2.2.1. ID Omits Essential Features of Passive Demethylation 

Previous multisite models of methylation dynamics have studied large scale maintenance of methylation patterns [16, 17, 18, 19]. These models have used an approximation where the probabilities of different states at each site are independently distributed (ID). This approximation has been useful in the study of maintenance, but here we show that even when the probability of error is independent at each site, the ID approximation omits important features of the system which result from the conservation of carbon-carbon bonds. In dynamic scenarios of passive demethylation the conservation of these bonds gives rise to heterogeneity in the population, site-site correlations, and the absence of opposite polarity hemimethylated sites. 

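The ID approximation for multisite models is achieved by taking the Kronecker product of the one site transfer matrix given in Eq. 7: 

$$T^{(n)} \approx \{T_1^{(1)}, T_2^{(1)}, \ldots, T_n^{(1)}\}. \qquad (13)$$

Taken with Eqs. 7 and 12 we have: 

$$P^{(n)}(g+1) \approx T^{(n)} P^{(n)}(g) \qquad (14)$$
$$= \{T_1^{(1)}, \ldots, T_n^{(1)}\}\,\{P_1^{(1)}(g), \ldots, P_n^{(1)}(g)\} \qquad (15)$$
$$= \{T_1^{(1)} P_1^{(1)}(g), \ldots, T_n^{(1)} P_n^{(1)}(g)\} \qquad (16)$$
$$= \{P_1^{(1)}(g+1), \ldots, P_n^{(1)}(g+1)\}, \qquad (17)$$

where in Eq. 16 we have used the mixed product property of the Kronecker product (i.e. $(A \otimes B)(C \otimes D) = AC \otimes BD$). The probability of a specific sequence is then given by Eq. 11: 

$$\Pr(\{s_1, \ldots, s_n\} \mid g+1) \approx \{s_1, \ldots, s_n\} \cdot P^{(n)}(g+1) \qquad (18)$$
$$= \prod_{i=1}^{n} s_i \cdot P_i^{(1)}(g+1), \qquad (19)$$

which is a more common notation for an ID approximation. 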

The ID approximation neglects correlations between sites 
that are the result of bond conservation. As the methylation sta- 
tus of a site is a categorical variable there is no simple summary 
statistic to illustrate the effect of the maintenance correlations. 

In the absence of a summary statistic for categorical variables, consider the dynamics in Fig. 4. Here we illustrate the first 3 divisions of a system in the limiting case of $\mu = 1$ for both sites, starting with $\{s_1, s_2\} = \{m, m\}$. This limiting example is deterministic and the first division results in two hemimethylated strands ($\{w, w\}$ and $\{c, c\}$). Subsequent divisions maintain the hemimethylated strands while adding demethylated strands. 

We see in this example that if a methyl group is detected at 
the first site in the locus it predicts with certainty that the second 
site will also be methylated. Bond conservation is the source of 
these site-site correlations, and constitutes an asymmetric cell 
division. 

For a single site with $\mu = 1$, starting with $P(0) = m$, repeated application of Eq. 6 gives us $P(3) = (0, 1/8, 1/8, 3/4)$, in correspondence with the single sites illustrated in Fig. 4. 
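
The single-site value quoted above can be checked by iterating the one-site update directly. The sketch below writes out the action of (1/2)[W + C] component by component rather than as a matrix product; the function names and the ordering (m, w, c, d) of the tuple are assumptions made for this illustration.

```python
from fractions import Fraction


def one_site_step(p, mu):
    """Apply one division to the single-site fractions p = (m, w, c, d)."""
    m, w, c, d = p
    half = Fraction(1, 2)
    watson_methylated = m + w  # parents whose Watson strand carries the methyl group
    crick_methylated = m + c   # parents whose Crick strand carries the methyl group
    return (
        half * (1 - mu) * (watson_methylated + crick_methylated),  # maintenance succeeds
        half * mu * watson_methylated,                             # Watson daughter, error
        half * mu * crick_methylated,                              # Crick daughter, error
        half * (c + d) + half * (w + d),                           # inherited strand unmethylated
    )


def iterate(p0, mu, generations):
    """Apply the one-site update repeatedly."""
    p = p0
    for _ in range(generations):
        p = one_site_step(p, mu)
    return p


start = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))  # all cells methylated
p3 = iterate(start, Fraction(1), 3)      # limiting case mu = 1
p10 = iterate(start, Fraction(3, 20), 10)  # mu = 0.15
print(p3)
```

Exact fractions are used so that the result can be compared with the quoted vector without rounding.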

For multiple sites one can quickly see that the ID approximation in Eq. 19 yields poor correspondence with what is illustrated. For example, the ID approximation yields a probability of 9/16 in this case for observing the sequence $\{d, d\}$ in the third generation, the probability of which is easily seen from the figure to be 3/4. This underestimate of $\Pr(\{d, d\} \mid 3)$ is balanced by overestimating the probabilities for unobserved states such as $\{c, d\}$. 
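
The ID numbers in this example follow from multiplying single-site probabilities, as in Eq. 19. A minimal sketch, assuming the basis ordering (m, w, c, d) and using hypothetical helper names:

```python
from fractions import Fraction

STATES = ("m", "w", "c", "d")  # assumed basis ordering


def one_site_step(p, mu):
    """One division for the single-site fractions p = (m, w, c, d)."""
    m, w, c, d = p
    half = Fraction(1, 2)
    return (
        half * (1 - mu) * (2 * m + w + c),
        half * mu * (m + w),
        half * mu * (m + c),
        half * (w + c + 2 * d),
    )


def id_probability(sequence, site_vectors):
    """Eq. 19: product of the single-site probabilities of each state."""
    prob = Fraction(1)
    for state, p in zip(sequence, site_vectors):
        prob *= p[STATES.index(state)]
    return prob


p = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))
for _ in range(3):
    p = one_site_step(p, Fraction(1))  # limiting case mu = 1

pr_dd = id_probability(("d", "d"), (p, p))  # 9/16, versus 3/4 in Fig. 4
pr_cd = id_probability(("c", "d"), (p, p))  # nonzero, although {c, d} is never observed
pr_wc = id_probability(("w", "c"), (p, p))  # nonzero, although opposite polarities never co-occur
print(pr_dd, pr_cd, pr_wc)
```

The probability missing from {d, d} is spread over sequences, such as {c, d} and {w, c}, that do not appear in the three-generation example.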

Three generations with $\mu = 1$ 

Figure 4: A limiting example to underscore the importance of correlations between sites. Because the carbon-carbon bonds are not expected to be broken in passive demethylation the parent strands are maintained in the population (shown at edges of population). This maintenance is responsible for site-site correlations even when errors occur with independent probabilities. With $\mu < 1$ these strands are still maintained and serve as a continual source of cells with high numbers of methyl groups. The conservation of these strands constitutes asymmetric cell division. We also see there are no states with a combination of opposite polarity hemimethylated sites (w and c) as is predicted by independently distributed models. 

2.2.2. Exact Model 

To analyze passive demethylation without using the ID approximation one must consider the transfer matrices for the Watson and Crick strands, $W$ and $C$, separately. Performing a multisite analysis analogous to the one in Fig. 2 we find that the $n$ site Watson strand transition matrix $W^{(n)}$ is given by: 

$$W^{(n)} = \{W_1, W_2, \ldots, W_n\}, \qquad (20)$$

and similarly: 

$$C^{(n)} = \{C_1, C_2, \ldots, C_n\}. \qquad (21)$$

We note here that the $\mu$ values for the different sites do not need to be equivalent in constructing these matrices. Combining these two matrices gives us the exact multisite transfer matrix: 

$$T^{(n)} = \frac{1}{2}\left[W^{(n)} + C^{(n)}\right]. \qquad (22)$$

Eq. 22 reproduces the correct fractions for the $\mu = 1$ case illustrated in Fig. 4 as well as in the general case. 
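
One way to apply Eq. 22 without building the full 4^n by 4^n matrix is to propagate a dictionary of sequence fractions, treating the Watson-strand and Crick-strand daughters separately and letting all sites of a daughter share the same inherited strand. The sketch below is such an implementation under stated assumptions: the function names are hypothetical, and the dictionary representation is a design choice made here, not the paper's method.

```python
from fractions import Fraction
from itertools import product


def daughter_options(state, mu, strand):
    """Daughter states at one site for the daughter inheriting `strand`.

    `strand` is "w" (Watson) or "c" (Crick); the same letter labels the
    hemimethylated state that results when maintenance fails.
    """
    if state in ("m", strand):  # inherited strand is methylated at this site
        return [("m", 1 - mu), (strand, mu)]
    return [("d", Fraction(1))]  # nothing to copy: the site is demethylated


def exact_step(dist, mus):
    """One division of Eq. 22. `dist` maps sequences to fractions, `mus` holds per-site error rates."""
    new = {}
    for parent, frac in dist.items():
        for strand in ("w", "c"):  # each parent gives one daughter per strand
            per_site = [daughter_options(s, mu, strand) for s, mu in zip(parent, mus)]
            for combo in product(*per_site):
                prob = frac / 2
                for _, p in combo:
                    prob *= p
                if prob:
                    child = tuple(s for s, _ in combo)
                    new[child] = new.get(child, 0) + prob
    return new


dist = {("m", "m"): Fraction(1)}
for _ in range(3):
    dist = exact_step(dist, (Fraction(1), Fraction(1)))  # limiting case mu = 1
print(dist)
```

For the limiting case this gives {w, w} and {c, c} with fraction 1/8 each and {d, d} with fraction 3/4, in agreement with the value read from Fig. 4.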

Algebraically, the difference between the expressions in Eqs. 22 and 13 can be found using the identities of the Kronecker product $\{A, (B + C)\} = \{A, B\} + \{A, C\}$ and $\{(A + B), C\} = \{A, C\} + \{B, C\}$. We find for the $T^{(2)}$ case generated by Eq. 13: 

$$\{T^{(1)}, T^{(1)}\} = \frac{1}{4}\{(W + C), (W + C)\} \qquad (23)$$
$$= \frac{1}{4}\left[\{W, W\} + \{C, C\} + \{C, W\} + \{W, C\}\right] \qquad (24)$$
$$= \frac{1}{2} T^{(2)} + \frac{1}{4}\left[\{C, W\} + \{W, C\}\right]. \qquad (25)$$

We therefore see that there are cross-terms in the ID approximation not found in the exact model. 
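
The decomposition in Eq. 25 can be verified numerically for two sites. The sketch below builds the single-site matrices in an assumed basis ordering (m, w, c, d), with hypothetical helper names and an illustrative error rate, and compares both sides entry by entry using exact fractions.

```python
from fractions import Fraction


def watson_matrix(mu):
    """Single-site matrix for the daughter inheriting the Watson strand (columns are parents)."""
    return [[1 - mu, 1 - mu, 0, 0], [mu, mu, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1]]


def crick_matrix(mu):
    """Single-site matrix for the daughter inheriting the Crick strand (columns are parents)."""
    return [[1 - mu, 0, 1 - mu, 0], [0, 0, 0, 0], [mu, 0, mu, 0], [0, 1, 0, 1]]


def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(k, a):
    return [[k * x for x in row] for row in a]


mu = Fraction(3, 20)  # illustrative error rate
W, C = watson_matrix(mu), crick_matrix(mu)
T1 = scale(Fraction(1, 2), add(W, C))
T2_exact = scale(Fraction(1, 2), add(kron(W, W), kron(C, C)))  # Eq. 22 for n = 2
cross = scale(Fraction(1, 4), add(kron(C, W), kron(W, C)))     # cross-terms of Eq. 25
T2_id = kron(T1, T1)                                           # Eq. 13 for n = 2
identity_holds = T2_id == add(scale(Fraction(1, 2), T2_exact), cross)
print(identity_holds, cross[9][0], T2_exact[9][0])
```

In this ordering row 9 is the sequence {c, w} and column 0 is {m, m}: the cross-term gives that transition a weight of mu^2/4, while the exact matrix gives it zero.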

2.3. Correlated Errors 

Passive demethylation is a result of the maintenance methylation enzymes failing to correctly methylate the corresponding site on the newly constructed strand of DNA. We may therefore suspect that if this enzyme is in low quantity or otherwise blocked from accessing the locus, the chance of error at all sites in the locus is increased. This generates additional correlations between sites. 

To model this effect we consider multiple possible values of $\mu$, each denoted with a subscript ($\mu_1, \mu_2, \ldots$). When a cell divides there is probability $p(\mu_j)$ that the daughter cells are generated with error rate at every site in the locus equal to $\mu_j$. We also have the expected value of $\mu$ given by: 

$$\bar{\mu} = \sum_{j} p(\mu_j)\, \mu_j. \qquad (26)$$

The quantity $p(\mu_j)$ equivalently denotes the expected fraction of cells to divide with the error rate $\mu_j$. In the case of all sites in a dividing cell having equivalent error rates, this allows us to write down the transfer matrix as the weighted sum of transfer matrices: 

$$T^{(n)} = \sum_{j} p(\mu_j)\, T^{(n)}(\mu_j), \qquad (27)$$

where we have made the individual transfer matrices an explicit function of the error rates. For a continuous distribution of $\mu$ values we can extend the sum to an integral: 

$$T^{(n)} = \int_0^1 p(\mu)\, T^{(n)}(\mu)\, d\mu. \qquad (28)$$

When the methylation error rates differ from site to site, the model is more complex but can be constructed in a similar manner. One would have a probability of a set of error rates and take the analogous sum over all sets considered. 
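
Because Eq. 27 is a weighted sum of transfer matrices, one division with correlated errors can be sketched as a weighted sum of exact single-rate divisions. The following is an illustrative sketch with hypothetical function names; it uses two error rates whose mean is 0.15 and compares one division of a two-site locus against the independent-error model with the same mean rate.

```python
from fractions import Fraction
from itertools import product


def daughter_options(state, mu, strand):
    """Daughter states at one site for the daughter inheriting `strand` ("w" or "c")."""
    if state in ("m", strand):
        return [("m", 1 - mu), (strand, mu)]
    return [("d", Fraction(1))]


def exact_step(dist, mu):
    """One division with the same error rate mu at every site (Eq. 22)."""
    new = {}
    for parent, frac in dist.items():
        for strand in ("w", "c"):
            per_site = [daughter_options(s, mu, strand) for s in parent]
            for combo in product(*per_site):
                prob = frac / 2
                for _, p in combo:
                    prob *= p
                if prob:
                    child = tuple(s for s, _ in combo)
                    new[child] = new.get(child, 0) + prob
    return new


def mixture_step(dist, rates):
    """Eq. 27: weighted sum over error rates; `rates` maps mu_j to p(mu_j)."""
    new = {}
    for mu, weight in rates.items():
        for state, frac in exact_step(dist, mu).items():
            new[state] = new.get(state, 0) + weight * frac
    return new


rates = {Fraction(0): Fraction(3, 4), Fraction(3, 5): Fraction(1, 4)}
mean_mu = sum(mu * weight for mu, weight in rates.items())  # Eq. 26
start = {("m", "m"): Fraction(1)}
correlated = mixture_step(start, rates)
independent = exact_step(start, mean_mu)
print(mean_mu, correlated[("w", "w")], independent[("w", "w")])
```

With these illustrative rates, daughters that lose methylation at both sites are four times as frequent as in the independent-error model with the same mean rate, while daughters with a single loss are less frequent.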

2.4. Selection 

Methylation regulates gene expression and in many cases this can affect cell survival. This will bias the distribution of cells. To incorporate selection we can multiply the transition matrix by a diagonal matrix $S^{(n)}$ such that the diagonal element 

$$\{s_1, s_2, \ldots, s_n\} \cdot S^{(n)} \{s_1, s_2, \ldots, s_n\} \qquad (29)$$

gives the survival fraction for cells with sequence $\{s_1, s_2, \ldots, s_n\}$. 

Multiplication by the matrix $S^{(n)}$ leaves the probability vector un-normalized, such that the entries no longer sum to one. In order to keep the probability vector normalized when selection is present we must divide by the sum of the elements of the un-normalized $P^{(n)}(g+1)$. This gives us the equation: 

$$P^{(n)}(g+1) = \frac{1}{\left|S^{(n)} P^{(n)}(g)\right|}\, T^{(n)} S^{(n)} P^{(n)}(g), \qquad (30)$$

where the bars in the pre-factor indicate taking the norm of the resulting vector. 

Eq. 30 describes selection that occurs before division. In general $S^{(n)}$ and $T^{(n)}$ do not commute. If we are instead concerned with selection that occurs after division we are required to modify the equation such that: 

$$P^{(n)}(g+1) = \frac{1}{\left|S^{(n)} T^{(n)} P^{(n)}(g)\right|}\, S^{(n)} T^{(n)} P^{(n)}(g). \qquad (31)$$

In general the order of operations affects the interpretation of data. 
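
The difference between selecting before and after division can be seen in a small single-site example. The sketch below is illustrative only: the survival function is a hypothetical choice made for this example (it is not taken from the paper), and the function names are assumptions.

```python
from fractions import Fraction
from itertools import product


def daughter_options(state, mu, strand):
    """Daughter states at one site for the daughter inheriting `strand` ("w" or "c")."""
    if state in ("m", strand):
        return [("m", 1 - mu), (strand, mu)]
    return [("d", Fraction(1))]


def exact_step(dist, mu):
    """One division without selection (Eq. 22), same error rate at every site."""
    new = {}
    for parent, frac in dist.items():
        for strand in ("w", "c"):
            per_site = [daughter_options(s, mu, strand) for s in parent]
            for combo in product(*per_site):
                prob = frac / 2
                for _, p in combo:
                    prob *= p
                if prob:
                    child = tuple(s for s, _ in combo)
                    new[child] = new.get(child, 0) + prob
    return new


def survival(state):
    """Hypothetical survival fraction: 1 if every Watson-strand site is methylated, else 1/2."""
    fully_methylated = all(s in ("m", "w") for s in state)
    return Fraction(1) if fully_methylated else Fraction(1, 2)


def select(dist):
    """Multiply by the diagonal selection matrix and renormalise."""
    weighted = {s: survival(s) * f for s, f in dist.items()}
    total = sum(weighted.values())
    return {s: f / total for s, f in weighted.items()}


def step_select_before(dist, mu):
    """Eq. 30: selection acts on the parents, then they divide."""
    return exact_step(select(dist), mu)


def step_select_after(dist, mu):
    """Eq. 31: the parents divide, then selection acts on the daughters."""
    return select(exact_step(dist, mu))


start = {("m",): Fraction(1)}
before = step_select_before(start, Fraction(1, 2))
after = step_select_after(start, Fraction(1, 2))
print(before, after)
```

Starting from a fully methylated population, selection before division has no effect on the first generation, whereas selection after division already depletes the Crick-polarity hemimethylated daughters.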

3. Results 

Fig. 5 illustrates the typical behavior of the models given by Eq. 22 (top) and Eq. 27 (bottom). Here we have drawn 15 samples from the distributions after 2, 10 and 20 divisions. We started the simulation with all sites in all cells being methylated. In the top samples, all cells divide with error rate $\mu = 0.15$, while in the bottom we have 3/4 of cells dividing with $\mu = 0$ and 1/4 of cells dividing with $\mu = 0.6$, thus maintaining the same $\bar{\mu}$ as in the top. The correlated errors are most apparent in the clustering of the methyl groups in the samples. 

Figure 5: 15 samples of a five site model taken at three different generations (2, 10 and 20 divisions) using Eq. 22 (top) and Eq. 27 (bottom). The samples are reordered in the columns to guide the eye. In the top samples, all cells divide with the site error rate of $\mu = 0.15$, while at the bottom 3/4 of cells divide with $\mu = 0$ and 1/4 with $\mu = 0.6$, preserving $\bar{\mu} = 0.15$. The correlated errors are most apparent in the clustering observed after 10 divisions. We also see that there are no samples containing opposite polarity hemimethylated states. 



3.1. Inaccessible States 

In Fig. 5 we see that there are no samples having opposite polarity hemimethylated states (i.e. no states with both Watson and Crick hemimethylated sites). These states are strictly disallowed by passive demethylation. 

To illustrate the origin of this effect in the model, Fig. 6 shows the matrix of allowed transitions for $0 < \mu < 1$. Here, for a two site model, black boxes indicate a possible transition from the parent column to the daughter row. From this figure we see that there are two states that are inaccessible, having opposite polarity hemimethylated sites. No parent state is capable of generating either of these states upon division, not even these states themselves. 

Figure 6: Illustration of the matrix of allowed transitions (filled squares) from parent (column) to daughter (row) given by Eq. 22 for two sites with $0 < \mu < 1$. We see that the state with both sites demethylated feeds only into itself, but more notably there are two states that are inaccessible by passive means. In $n$ site models any state with opposite polarity hemimethylated sites is disallowed. Observation of such states is evidence of active mechanisms. 

The ID model yielded finite probabilities for these states, but from the full model in Eq. 22 we see that no passive demethylation can result in these states. With higher numbers of sites, as in Fig. 5, any state with opposite polarity hemimethylated sites is strictly inaccessible by passive demethylation. 
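
The set of inaccessible two-site states can be enumerated directly from the transition rules. This is a minimal sketch with hypothetical function names: it lists every daughter state that any of the 16 parent states can produce with nonzero probability and reports what is left over.

```python
from fractions import Fraction
from itertools import product

STATES = ("m", "w", "c", "d")


def daughter_options(state, mu, strand):
    """Daughter states at one site for the daughter inheriting `strand` ("w" or "c")."""
    if state in ("m", strand):
        return [("m", 1 - mu), (strand, mu)]
    return [("d", Fraction(1))]


def daughters(parent, mu):
    """Set of daughter sequences reachable from `parent` in one division."""
    reachable = set()
    for strand in ("w", "c"):
        per_site = [daughter_options(s, mu, strand) for s in parent]
        for combo in product(*per_site):
            if all(p > 0 for _, p in combo):
                reachable.add(tuple(s for s, _ in combo))
    return reachable


mu = Fraction(3, 20)  # any value with 0 < mu < 1 gives the same sets
all_states = list(product(STATES, repeat=2))
reachable = set().union(*(daughters(parent, mu) for parent in all_states))
inaccessible = sorted(set(all_states) - reachable)
print(inaccessible)
```

Only the two opposite polarity sequences are left over, because every site of a given daughter shares the same inherited strand and so can only be hemimethylated with that strand's polarity.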

3.2. Summary Distributions 

For the full model we consider all possible states. At four states per site and five sites this gives $4^5 = 1024$ states. As a summary of the features of the model we consider the total number of methyl groups on the Watson strand of each cell. As the dynamics of the Watson and Crick strands are symmetric in the model the sum of methyl groups on the Crick strand will have an equivalent distribution when there is no selection. 

For an $n$ site system the reduced representation has $n + 1$ states ranging from zero methyl groups to $n$ methyl groups. Fig. 7(A) shows the probability distribution of cells for $n = 5$ as a function of the total number of methyl groups on the Watson strand after 2 divisions, 10 divisions, and 20 divisions with $\mu = 0.15$. 
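
The reduced representation can be computed by propagating the full five-site distribution and then collapsing it onto the number of Watson-strand methyl groups. The sketch below is an illustration under stated assumptions (hypothetical function names, floating point arithmetic, a dictionary in place of the 1024-component vector); it is not the code used to produce Fig. 7.

```python
from itertools import product


def daughter_options(state, mu, strand):
    """Daughter states at one site for the daughter inheriting `strand` ("w" or "c")."""
    if state in ("m", strand):
        return [("m", 1.0 - mu), (strand, mu)]
    return [("d", 1.0)]


def exact_step(dist, mu):
    """One division of the exact model (Eq. 22) with the same mu at every site."""
    new = {}
    for parent, frac in dist.items():
        for strand in ("w", "c"):
            per_site = [daughter_options(s, mu, strand) for s in parent]
            for combo in product(*per_site):
                prob = frac / 2
                for _, p in combo:
                    prob *= p
                if prob:
                    child = tuple(s for s, _ in combo)
                    new[child] = new.get(child, 0.0) + prob
    return new


def watson_methyl_counts(dist, n_sites):
    """Collapse the distribution onto the number of Watson-strand methyl groups (states m and w)."""
    summary = [0.0] * (n_sites + 1)
    for state, frac in dist.items():
        summary[sum(s in ("m", "w") for s in state)] += frac
    return summary


n_sites, mu = 5, 0.15
dist = {("m",) * n_sites: 1.0}
summaries = {}
for generation in range(1, 21):
    dist = exact_step(dist, mu)
    if generation in (2, 10, 20):
        summaries[generation] = watson_methyl_counts(dist, n_sites)

for generation, summary in summaries.items():
    print(generation, [round(x, 4) for x in summary])
```

A useful consistency check is that the mean number of Watson-strand methyl groups should equal n(1 - mu/2)^g, since each division keeps a methylated Watson site in one daughter and copies a methylated Crick site onto the new Watson strand with probability 1 - mu in the other.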

In Fig. 7(B) we see the effects of correlated errors (Eq. 27) on this distribution. The distribution of error rates, like that of the bottom samples in Fig. 5, is bimodal, having 3/4 of cells dividing with $\mu_1 = 0$ and 1/4 dividing with $\mu_2 = 0.6$, preserving $\bar{\mu} = 0.15$. Here we see that the distribution after two divisions has a marked divergence from the independent errors model, having very few cells with a single loss of methylation on the Watson strand. After 10 divisions the distribution is bimodal in comparison to the single peaked distribution from the independent errors model. 

Figure 7: Distributions of the total number of methyl groups on the Watson strand after 2 divisions, 10 divisions, and 20 divisions. A shows distributions when all sites in all cells have $\mu = 0.15$. B maintains $\bar{\mu} = 0.15$ but most cells have no error and a quarter of the cells divide with a high error rate of $\mu = 0.6$. C (constant $\mu$ with selection) shows the case with constant $\mu$ and selection against demethylated cells. We see in C that the distribution is stable, having changed very little from 10 divisions to 20 divisions. In all simulations the initial fraction of cells with all five sites methylated is 1 (not shown). Without selection the distributions tend to low methyl group numbers. The bimodal $\mu$ distribution model is characterized by fewer cells with intermediate numbers of methyl groups. 

3.3. Stable Distributions 

Fig. 4 illustrates the conservation of methylation on the grandmother strands when $\mu = 1$. The grandmother strands are those that were present in the population when expansion began, which we assume to be completely methylated. Without enzymatic action to remove the methyl group, these strands remain methylated indefinitely. When $\mu < 1$, as long as the cells with these strands survive, the grandmother strands act to seed the population with new methylated cells. 

The demethylated state is trapping. Once a site is demethy- 
lated in this model, all descendants will also be demethylated. 
This acts to dilute the effects of the grandmother cells seeding 
the population. With selection however a stable heterogeneous 
distribution can be achieved. 



As an example of this we use Eq. 30 and construct a model such that cell fitness is a linear function of the number of methyl groups on the Watson strand. In this model, states with 5 methyl groups on the Watson strand have a survival fraction of 1, states with 4 methyl groups have a survival fraction of 4/5, and so on. This model is not symmetric between the Watson and Crick strands as we are considering selection acting only on the state of the Watson strand. 

Stable distributions are found by computing the eigenvectors of the matrix given by $T^{(n)} S^{(n)}$. There are in general $4^n$ eigenvectors for this matrix. Eigenvectors with negative valued components aren't biologically meaningful, and if the error rates ($\mu$) are equivalent at multiple sites, then there is significant degeneracy. Here we are interested in the eigenvector associated with the dominant eigenvalue, to which a population starting completely methylated will converge. 
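
Instead of a full eigendecomposition, the dominant eigenvector can be approximated by power iteration, which here amounts to applying Eq. 30 repeatedly from the fully methylated state. The sketch below uses that substitute technique for a two-site locus with the linear fitness described above; the function names, the two-site choice and the iteration count are assumptions made to keep the example small.

```python
from itertools import product


def daughter_options(state, mu, strand):
    """Daughter states at one site for the daughter inheriting `strand` ("w" or "c")."""
    if state in ("m", strand):
        return [("m", 1.0 - mu), (strand, mu)]
    return [("d", 1.0)]


def exact_step(dist, mu):
    """One division of the exact model (Eq. 22) with the same mu at every site."""
    new = {}
    for parent, frac in dist.items():
        for strand in ("w", "c"):
            per_site = [daughter_options(s, mu, strand) for s in parent]
            for combo in product(*per_site):
                prob = frac / 2
                for _, p in combo:
                    prob *= p
                if prob:
                    child = tuple(s for s, _ in combo)
                    new[child] = new.get(child, 0.0) + prob
    return new


def survival(state):
    """Linear fitness: fraction of sites whose Watson strand is methylated."""
    return sum(s in ("m", "w") for s in state) / len(state)


def selection_step(dist, mu):
    """Eq. 30: select, renormalise, divide. Returns the new distribution and the mean fitness."""
    weighted = {s: survival(s) * f for s, f in dist.items()}
    mean_fitness = sum(weighted.values())
    normalised = {s: f / mean_fitness for s, f in weighted.items() if f > 0}
    return exact_step(normalised, mu), mean_fitness


n_sites, mu = 2, 0.15
dist = {("m",) * n_sites: 1.0}
for _ in range(200):
    previous = dist
    dist, growth = selection_step(dist, mu)

# Largest change in any state over the final iteration
residual = max(abs(dist.get(s, 0.0) - previous.get(s, 0.0)) for s in set(dist) | set(previous))
print(round(growth, 6), residual)
```

At convergence the mean fitness returned by `selection_step` approximates the dominant eigenvalue of the product of the transfer and selection matrices, and the distribution retains appreciable weight on several methylation states rather than collapsing onto one.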

In Fig. 7(C) we show the summary distribution for the model with selection. Here we see very little change in the shape of the distribution from 10 divisions to 20 divisions. This illustrates that survival of the grandmother strand can seed the population and maintain a constant and heterogeneous distribution of cells. 

4. Discussion 

All stochastic systems beginning in the same state will be 
initially correlated (autocorrelation) but this effect will decay 
as the two sites diverge. Models of demethylation using site 
independent distributions starting with all sites methylated also 
have this type of correlation. 

We have seen that the maintenance of carbon-carbon bonds gives rise to an additional type of correlation that is important in the dynamics of passive demethylation. While the methylation status of a site is categorical and thus has no standard summary statistic for analyzing the correlations between sites, we see from the extreme example in Fig. 4 that observation of a methyl group on the first site predicts with certainty that there will be one on the second site. As the error rate for maintenance methylation decreases the effect is still present but less strong. 



Additionally we see from Fig. 6 the inaccessibility by passive demethylation of opposite polarity hemimethylated states. This strict site-site anticorrelation is present for all error rates. Site independent distributions (ID) fail to capture this effect. Opposite polarity hemimethylated sites have been observed [21] and the observation was previously used to infer that de novo methylation could result in a hemimethylated state [18]. Depending on the specific mechanisms of the process this may also be a possible outcome of active demethylation. The detection of hemimethylated states requires an additional step in sequencing [22, 21], but when questions of active versus passive mechanisms arise this extra step might help. 

These correlations by bond maintenance are an example of 
asymmetric cell division. As there are no opposite polarity 
hemimethylated dyads, if a parent having only Watson polarity 
hemimethylated sites divides, the daughter receiving the Crick 
strand will be unmethylated at each site that was hemimethy- 
lated in the parent. The daughter inheriting the Watson strand 
will either be methylated at these sites or hemimethylated. This 
asymmetry between daughters generates heterogeneity in the 
population. This heterogeneity is explored in Fig. 7(A). 

In combination with selection, the heterogeneity resulting 
from asymmetric cell division can give rise to a stable distribution 
(Fig. 7C). This might correspond to heterogeneity found 
in homeostatic conditions. The model then allows one to make 
connections between dynamics and cross-sectional data. Ob- 
servation of a heterogeneous distribution of methylation states 
might also be used as a biomarker for the stimulus that triggered 
demethylation. 

The simulations in this paper have all assumed that cells are 
initially methylated and begin to lose methylation upon the 
first division with a constant rate $\mu$ for the remainder of the 
simulation. This corresponds to a scenario where an external 
stimulus begins at the start of the simulation and has a constant 
effect throughout. An alternate method might be to consider 
time dependent parameters. The exact model in this paper can 
easily be altered for such scenarios. 

Passive demethylation could either be a global process where 
the entire cell is deficient in the maintenance methyltransferase 
or be a local effect where a particular locus is partially or 
completely sequestered from the maintenance enzymes. In the 
global case we would expect that the entire genome would be 
losing methylation. This global demethylation typically results 
in cells that are unable to survive [3]. 

The mechanics of active demethylation are still not well understood. If active demethylation acts to leave a site hemimethylated, then we expect opposite polarity hemimethylated sites in the locus to be a key signature. Additionally we expect that since active mechanisms don't conserve carbon-carbon bonds, there will not be maintenance of grandmother methylated strands, asymmetric cell division won't be as strong an effect, and the population will be more homogeneous, having narrower distributions than those in Fig. 7(A). 

The model in this paper provides a tool for measuring passive demethylation rates in regulatory scenarios. Additionally, this model may predict if active mechanisms are playing a role, and in the way that a model of genetic drift provides a necessary baseline for measuring the effects of selection, this model can be used in elucidating mechanisms of active demethylation. 